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Abstract 

This  report  comprises  the  concluding  analysis  entitled  “Robust  Multispectral 
Imaging  Sensors  for  Autonomous  Robots”  of  the  project  as  well  as  three 
additional  documents.  First,  a  complete  report  comparing  alternative 
demosaicking  algorithms  for  early  processing  of  the  sensors  we  are  proposing. 
Second,  a  revision  of  a  PowerPoint  presentation  of  the  final  report  for  this  project, 
provided  in  3/01  for  ARO.  Third,  the  title  page  of  Mr.  Rajeev  Ramanath’s 
Master’s  Thesis,  which  was  accepted  in  8/01  by  NCSU. 
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Robust  Multispectral 
Imaging  Sensors  for 
Autonomous  Robots 

Rajeev  Ramanath,  Student  Member,  IEEE 
Wesley  E.  Snyder,  Senior  Member,  IEEE 
Griff  L.  Bilbro,  Senior  Member,  IEEE 
William  A.  Sander 

Abstract — Use  of  multispectral  capability  in  imaging  de¬ 
vices  provide  us  with  spectral  characteristics  of  the  objects 
being  viewed.  This  helps  in  “recognizing”  the  object  with¬ 
out  any  knowledge  of  the  geometry  or  shape  of  the  desired 
target.  Until  recently,  multispectral  images  were  obtained 
using  multiple  filters  placed  in  front  of  the  imaging  devices 
or  by  using  prism  based  beam-splitting  technologies.  With 
the  advent  of  Digital  Still  Color  Cameras,  a  whole  new  field 
of  imaging  has  emerged,  which  consists  of  a  monolithic  ar¬ 
ray  of  color  filters  overlaid  on  a  CCD  array  such  that  each 
pixel  samples  one  spectral  bsuid.  The  resulting  mosaic  of 
spectral  samples  is  processed  to  produce  a  high  resolution 
color  image  where  the  values  of  the  spectral  bands  not  sam¬ 
pled  at  a  certain  location  are  estimated  from  its  neighbors. 
This  process  is  often  referred  to  as  demosaicking^  This  pa¬ 
per  proposes  the  use  of  this  technology  as  a  general  imaging 
modality  for  robust  robot  vision  and  compares  several  de- 
mosaicking  algorithms. 

Index  Terms — Multispectral,  demosaicking,  demosaicing. 
Color  Filter  Array,  hexagonal  sampling 

I.  Introduction 

OBTAINING  multiple  spectral  samples  of  an  object 
provide  us  with  its  spectral  signature,  which  under 
many  circumstances  is  of  crucial  importance.  The  plethora 
of  information  obtained  from  a  color  picture  when  com¬ 
pared  to  a  grayscale  picture,  the  wealth  of  information  in 
obtaining  a  spectral  signature  of  an  enemy  tank  when  com¬ 
pared  to  an  integrated  monochromatic  image,  the  advan¬ 
tages  of  identifying  vegetation,  rocks,  minerals,  etc.  from 
a  sateUite  with  multispectral  capabiHty,  rather  than  sin¬ 
gle  channel  information  cannot  be  overstated.  In  fig.l,  we 
show  the  reflectivities  of  four  different  objects  [1],  the  iden¬ 
tification  of  which  are  critical  for  military  purposes;  espe¬ 
cially  for  target  seeking  missiles  or  reconnaissance  missions. 
Notice  that  the  visible  band  would  not  be  as  useful  as  a 
combination  of  visible  and  MWIR  bands  which  clearly  pro¬ 
vides  unique  spectral  signatures  for  camouflaged  military 
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vehicles.  Other  applications  for  multispectral  imagery  have 
been  found  in  glaciology,  hydrology,  volcanology,  geological 
surveys[2],  monitoring  urban  change[3]  etc. 


Fig.  1.  Reflectivities  of  a  few  targets  critical  for  detecting  camouflage 
in  battlefields 

The  classical  approach  to  multispectral  imaging  is  to  use 
separate  sensors  and  to  somehow  split  the  optical  path  to 
fall  on  those  sensors  in  a  manner  which  retains  registration 
(or  is  later  registered).  For  example  in  3  CCD  cameras, 
typically  the  fight  from  the  lens  passes  through  a  prism, 
is  split  into  three  rays,  then  through  three  different  filters 
and  onto  three  different  focal  plane  detectors  (see  fig.  2). 


Red  Sensor 


Blue  Sensor 

Object  Objective  Lens 


Fig.  2.  3  sensor  configuration  on  most  multispectral  cameras 

On  satellite  or  airborne  sensors,  the  optical  path  might 
be  deflected  by  a  moving  mirror  onto  an  array  of  sensors 
{e.g.  MODIS/ ASTER  use  “whiskbroom”  scanning  mir¬ 
rors).  Both  approaches  have  the  inherent  problem  of  reg¬ 
istration  of  the  images.  In  addition,  such  sensors  simply 
cannot  tolerate  the  vibrations  and  accelerations  of  a  mobile 
robot. 

In  [4]  reconnaissance  and  surveillance  robots  are  de¬ 
scribed,  each  provided  with  a  suite  of  sensors.  A  mosaicked 
set  of  sensors  as  proposed  here  would  be  of  immense  use 
in  such  missions,  providing  multispectral  capability  along 
with  ruggedness. 

Airborne  systems,  robot  platforms,  missiles,  etc.  are 
used  under  “high  stress”  situations  which  require  that  the 
imaging  system  perform  independent  of  the  systems’  vibra¬ 
tion  and  acceleration.  Multiple  sensor  systems  are  difficult 
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to  manufacture  with  this  capability.  Monolithic  color  fil¬ 
ter  array  based  systems  however  provide  this  stability  for 
imaging  while  maintaining  registration  as  it  is  an  inherent 
property  of  such  devices. 


(a)  (b) 

Fig.  3.  Bayer  Color  Filter  Array  (a)Classical  form  using  three  filters 
(b)  Similar  configuration  with  four  filters 

Commercially  available  Digital  Still  Color  Cameras 
(DSCs)  approach  the  problem  of  using  a  single  sensor  to 
capture  multispectral  data  by  making  use  of  a  monolithic 
array  of  color  filters  overlaid  on  a  CCD  array,  such  that 
each  pixel  records  only  one  sample  of  the  spectrum.  Three 
or  more  different  filters  (typically  three)  are  used  in  a  tes¬ 
sellated  fashion  as  shown  in  fig.3(a)  [5].  To  estimate  say, 
a  green  pixel  value  at  a  location  where  a  red  sample  was 
obtained,  we  interpolate  (demosaic)  and  obtain  this  infor¬ 
mation  from  the  neighbors.  A  variety  of  color  filter  array 
configurations  are  in  use  in  DSCs.  A  few  of  these  mosaics 
[6]  are  displayed  in  fig.4.  The  Bayer  configuration  is  how¬ 
ever  the  most  commonly  used. 

We  propose  that  mosaic  technology  is  a  robust  alter¬ 
native  to  existing  multispectral  imaging  modalities,  albeit 
with  some  loss  in  spatial  resolution. 

In  Section  2  we  illustrate  commonly  used  demosaicking 
algorithms  and  compare  their  advantages  and  disadvan¬ 
tages.  In  section  3,  we  use  experiments  to  illustrate  the 
trade-off  between  gaining  multispectral  resolution  over  loss 
in  spatial  resolution  due  to  mosaicking. 

II.  Demosaicking 

An  experiment  equivalent  to  that  done  with  mosaick¬ 
ing  visible  bands  has  not  been  done  in  the  infrared  (IR)  or 
other  spectral  bands.  Furthermore,  it  is  interesting  to  note 
that  we  are  not  constrained  to  three  bands  or  to  conven¬ 
tional  rectangular  pixels  or  to  the  human  visual  system. 
For  example,  we  could  simply  use  the  existing  Bayer  Ar¬ 
ray  configuration  and  use  four  spectral  bands  instead  of 
three  as  shown  in  fig.3(b).  In  fig.5(a),  we  illustrate  a  four- 
band  hexagonal  sampling  arrangement  for  which  we  are 
currently  building  hardware.  In  this  arrangement,  each 
pixel  has  exactly  two  neighbors  from  each  of  the  other 
bands. 

In  fig. 5(b),  we  illustrate  that  seven  bands  can  be  sensed 
in  this  way.  In  this  novel  7-band  hexagonal  arrangement, 
every  pixel  has  exactly  one  adjacent  neighbor  in  each  band, 
so  that  the  spectral  intensity  at  any  point  can  always  be 
estimated  from  data  no  farther  than  one  pixel  away  [7]. 
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Fig.  4.  Other  Color  Filter  Arrays  (a)Proposed  by  Yamanaka  (b) 
Green  channel  along  on  alternate  scan  lines  (c)  another  commonly 
used  configuration 


The  choice  of  spectral  bands  chosen  is  entirely  dependent 
upon  the  application. 

The  mosaic  of  samples  is  processed  to  produce  a  high 
resolution  multispectral  image  such  that  the  values  of  the 
color  bands  not  sampled  at  a  certain  location  are  estimated 
from  its  neighbors.  This  process  is  often  referred  to  as 
demosaicking.  A  variety  of  algorithms  for  demosaicking 
exist.  The  simplest  one  being  bilinear  interpolation.  An 
exhaustive  comparison  of  a  some  of  the  commonly  used 
demosaicking  methods  is  available  in  [8]. 

III.  Experiments 

In  this  section  we  compare  results  of  demosaicking  pro¬ 
cesses  on  different  images  using  several  different  methods 
described  in  [9], [10], [11], [12]  and  [13].  These  images  should 
give  the  reader  and  idea  about  the  pros  and  cons  of  this 
imaging  modality,  in  each  of  the  two  types  of  images;  those 
in  the  visible  region  and  the  IR  region  of  the  spectrum. 
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(b) 

Fig.  5.  Hexagonal  configurations  of  color  mosaics  (a)  Arrangement 
of  four  bandpass  filters  where  each  pixel  has  exactly  two  neighbors 
from  other  spectral  bands  (b)  A  similar  arrangement  in  which  seven 
spectral  bands  are  measured,  using  seven  filters,  and  a  multispectral 
estimate  may  be  obtained  from  immediate  neighbors 

The  Mean  Squared  Error  (MSE)  is  given  by 

1  3  M,JV 

-  fii,j,k))  (1) 

A;=l  i,j=l 

where  is  the  original  image  pixel  intensity  at  pixel 

location  {ij)  in  band  k  and  g{iij,k)  is  the  estimated  image 
pixel  intensity  at  pixel  location  (i,  j)  in  band  k. 

Although  the  MSE  metric  is  not  without  limitations,  es¬ 
pecially  when  used  as  a  global  measure  of  image  fidelity,  it 
used  due  to  its  ease  in  implementation.  We  hence  compare 
the  algorithms  with  two  different  metrics,  the  first  one  be¬ 
ing  the  mean  squared  error  and  the  other,  the  metric 
[14],  which  is  the  measured  error  in  the  CIE--L*a*b*  color 
space,  one  of  the  many  perceptually  uniform  color  spaces. 
Errors  in  perceptually  uniform  color  spaces  measure  errors 
that  human  observers  perceive  (which  is  meaningful  for 
images  in  the  visible  spectrum). 

A.  Visible  Spectrum 

The  images  in  the  visible  region  of  the  spectrum  were 
obtained  using  two  different  methods; 

♦  simulation  of  the  color  filter  array  sampling  for  which 
we  used  data  obtained  from  the  hyperspectral  Image 
dataset  [15].  The  results  are  presented  in  fig.6. 

«  a  Pulnix  TMC-1001  camera  (outputs  mosaicked  im¬ 
ages).  Figs.  7  and  8  show  results  of  demosaicking 
these  images. 

Table  I  shows  the  error  metrics  described  above  on  these 
images.  For  the  purposes  of  human  viewing,  image  qual¬ 


ity  metrics  in  perceptually  uniform  color  spaces  or  other 
metrics  that  measure  image  fidelity  would  be  preferred  for 
comparison. 

In  general,  visible  spectrum  images  have  two  kinds  of 
errors, 

•  zipper  effect  errors  that  occur  along  intensity  edges  as 
seen  in  fig.7 

•  confetti  errors  that  occur  at  isolated  pixels  that  are 
at  very  high  intensity  compared  to  their  neighbors,  as 
seen  in  fig.8. 

These  errors  are  “disturbing”  to  a  human  observer  when 
viewing  color  images  of  natural  scenes  or  scenes  about 
which  humans  have  prior  knowledge.  Notice  that  the  var¬ 
ious  algorithms  have  been  developed  with  edge  enhance¬ 
ment  being  the  primary  criterion.  Images  from  left  to  right 
in  fig.7  are  in  chronological  order  of  development  in  this 
field. 

B.  IR  Spectrum 

The  images  in  the  IR  region  of  the  spectrum  were  ob¬ 
tained  from  the  MASTER  dataset  [16]  by  subsampling  the 
multispectral  images  using  a  Bayer  sampling  array.  It  is 
to  be  emphasized  that  this  a  simulation  of  multispectral 
mosaic  imaging  in  IR.  Mosaicked  data  obtained  from  a 
mosaicked  IR  sensor  is  not  available  at  this  time. 

The  results  of  demosaicking  using  algorithms  devised  for 
the  visible  spectrum  are  shown  in  fig.9.  In  general,  we  ob¬ 
serve  a  similar  trend  in  errors  as  found  in  visible  spectrum 
images, 

•  zipper  effect  errors.,  that  arise  due  to  sharp  intensity 
edges 

•  confetti  errors,  that  arise  due  to  the  “low”  spatial  res¬ 
olution  of  these  images. 

Satelhte  /  airborne  imagery  has  the  inherent  drawback 
that  the  images  are  of  relatively  poor  resolution  when  com¬ 
pared  to  those  of  commercial  available  DSCs.  This  gives 
rise  to  confetti  errors  when  rendered  for  human  vision. 
However,  it  needs  to  be  noted  that  these  images  are  not 
to  be  viewed  by  an  observer  and  appreciated  for  quality, 
rather  these  images  will  be  used  by  target-seeking  missies 
or  parameter-monitoring  systems  which  will  use  the  spec¬ 
tral  information  in  these  images  to  perform  recognition  or 
other  similar  decision  tasks.  Multispectral  capability  is 
most  useful  when  the  targets  are  minimally  resolved  as  is 
the  case  in  satellite  and  airborne  imagery.  The  bounds  on 
“minimal”  resolution  are  being  researched  and  we  hope  to 
publish  those  results  in  the  near  future. 

Table  I  shows  average  error  metrics,  averaged  over  dif¬ 
ferent  regions  in  these  images.  The  Peak  Signal  to  Noise 
ratio  (PSNR)  is  given  by 

PSNR=mog,o-jf^  (2) 

where  MSE  is  the  mean  squared  error.  The  images  are  all 
scaled  between  0  and  1. 

We  need  to  bear  in  mind  that  for  decision-oriented  or 
recognition  systems,  ROC  curves  would  give  a  better  mea¬ 
sure  of  performance  of  these  algorithms.  But,  due  to  the 


4 


Fig.  6.  Simulation  of  DSC  Images  (a)  Original  Image  (b)Linear  Interpolation  (c)Cok  Interpolation  (d)Freeman  Interpolation  (e)  Laroche 
Interpolation  (f)Hamiltonr Adams  Interpolation 


Fig.  7.  DSC  Images  illustrating  zipper  effect  (a)  Original  Image  (b)Linear  Interpolation  (c)Cok  Interpolation  (d)Freeman  Interpolation  (e) 
Laroche  Interpolation  (f)Hami Iton- Adams  Interpolation 


(a)  (b)  (c)  (d)  (e)  (f) 

Fig.  8.  DSC  Images  illustrating  confetti  effect  (a)  Original  Image  (b)Linear  Interpolation  (c)Cok  Interpolation  (d)Freeman  Interpolation 
(e)  Laroche  Interpolation  (f)Hami Iton- Adams  Interpolation 


Fig.  9.  IR  Image  Mosaics;  band  39(Red),  33(Green),  24(Blu€)  (a)  Original  Image  (b)Linear  Interpolation  (c)Cok  Interpolation  (d)Freeman 
Interpolation  (e)  Laroche  Interpolation  (f)Hamilton- Adams  Interpolation 


wide  applicability  of  this  imaging  modality,  we  restrict  our-  IV.  CONCLUSIONS 

selves  to  the  MSE  metric.  errors  quantified  in  table  I  give  an  idea  about 

the  performance  of  these  algorithms.  Clearly,  Hamilton- 
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Algorithm 

Hyperspectral 

MASTER 

used 

Image  dataset 

Images 

AErgb 

PSNR 

(dB) 

MSB 

PSNR 

(dB) 

Linear 

6.5766 

0.0115 

19.39 

0.0570 

12.92 

Cok 

3.1777 

0.0024 

26.19 

0.0416 

13.81 

Freeman 

2.5263 

0.0011 

29.59 

0.0461 

13.36 

Laroche- 

Prescott 

1.6681 

0.00098 

30.09 

0.0410 

13.87 

Hamilton- 

Adams 

1.2473 

0.00045 

33.48 

0-0339 

14.69 

TABLE  I 


Errors  for  different  interpolation  algorithms  on  the 
Hyperspectral  Image  data-set  and  MASTER  Images  after 
demosaicking 


Adams’  algorithm  performs  best  under  the  MSE  metric; 
which  conforms  to  the  visual  appearance  of  the  resulting 
images  (the  lowest  AJ5*^  error,  making  the  errors  least  de¬ 
tectable  by  a  human  observer)  and  is  further  corroborated 
by  an  increasing  PSNR. 

Although  we  “lost”  spatial  resolution  in  using  the  Bayer 
Array  by  using  |  the  number  of  green  sensors  and  |  the 
number  of  red  and  blue  sensors,  we  were  able  to  reconstruct 
the  full  resolution  images  with  good  clarity,  gaining 

«  robustness  and 

•  the  use  of  only  one  array  of  sensors. 

Table  II  summarizes  the  advantages  and  disadvantages 
of  this  imaging  modality. 


Multiple 

Sensors 

Mosaicked 

Sensor  Array 

Registration 

can  be  difficult 

not  required 

Mechanical 

Robustness 

sensitive 

highly  robust 

No.  of  sensors 

for  mxn  size 
RGB  images 

mn  R,  mn  G,  mn  B 

mn  mn  q  mn  ^ 

(3-band  Bayer) 

No.  of  sensor 
arrays  for  3 

bands 

3 

1(3- band  Bayer) 

No.  of  sensor 
arrays  for  7 

bands 

7 

1  (7-band  Hexagonal) 

TABLE  II 

Pros  and  cons  of  Mosaicking 


beyond  other  existing  multispectral  imaging  systems  for 
use  in  robotic,  satellite-based,  field-portable  or  hand-held 
systems. 

The  average  loss  observed  by  mosaicking  sensors  clearly 
suggests  that  the  loss  may  be  decreased  by  using  “better” 
estimation  /  interpolation  methods. 

We  are  currently  posing  the  process  of  demosaicking  as 
an  optimization  problem  with  a  cost  function  such  that  it 
minimizes  the  errors  obtained  due  to  mosaicking,  using  a 
variety  of  priors  which  will  impose  certain  properties  on 
the  images  while 

•  maximizing  spectral  overlap  between  observed  target 
and  “expected  target”.  In  other  words,  minimizing 
the  “error”  (however  it  be  defined)  between  the  ob¬ 
served  and  known  spectral  signatures  of  the  target. 

•  maximizing  probability  of  detection  of  the  target  (in 
ATR  applications) 

We  anticipate  presenting  results  of  this  work  at  the  con¬ 
ference. 
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Although  there  are  performance  bounds  on  the  trade-off 
between  loss  in  spatial  resolution  and  the  gain  in  spectral 
resolution,  such  a  robust  system  has  potential  benefits  far 
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Demosaicking  methods  for  Bayer  Color  Arrays 

Rajeev  Ramanath^  Wesley  E.  Snyder^  Griff  L.  Bilbro^  William  Sander^ 

Abstract 

Digital  Still  Color  Cameras  sample  the  color  spectrum  using  a  monolithic  array  of  color 
filters  overlaid  on  a  CCD  array  such  that  each  pixel  samples  only  one  color  hand.  The 
resulting  mosaic  of  color  samples  is  processed  to  produce  a  high  resolution  color  image  such 
that  the  values  of  the  color  bands  not  sampled  at  a  certain  location  are  estimated  from 
its  neighbors.  This  process  is  often  referred  to  as  demosaicking.  This  paper  introduces  and 
compares  a  few  commonly  used  demosaicking  methods  using  error  metrics  like  mean  squared 
error  (MSE)  in  the  RGB  color  space  and  perceived  error  in  the  CIE-L*a*h*  color  space. 

Index  Terms 

Demosaicking,  Demosaicing,  Color  Filter  Arrays,  Digital  Color  Camera,  DSC,  Interpo¬ 
lation 


L  Introduction 


Commercially  available  Digital  Stlll  Color  Cameras  (DSC)  are  based  on  a  single 
CCD  array  and  capture  color  Information  by  using  three  or  more  color  filters,  each 
sample  point  capturing  only  one  sample  of  the  color  spectrum.  The  Bayer  Array  [1]  (shown 
in  fig.l)  Is  one  of  the  many  realizations  of  color  filter  arrays  (CFA)  possible.  Many  other 
Implementations  of  a  color-sampling  grid  have  been  incorporated  in  commercial  cameras, 
most  using  the  principle  that  the  luminance  channel  (green)  needs  to  be  sampled  at  a  higher 
rate  than  the  chrominance  channels  (red  and  blue).  The  choice  for  green  as  “representative” 

^  Rajeev  Ramanath,  Wesley  Snyder  and  Griff  Bilbro  are  with  the  Department  of  Electrical  and  Computer  Engi¬ 
neering  at  North  Carolina  State  University,  Raleigh,  NC,  27695-7914,  USA.  Phone:  (919)  513-2007,  email:  {rramanaj 
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^WiUiam  Sander  is  with  the  U.S.  Army  Research  Office,  Durham,  P.O.  Box  12211,  Research  Triangle  Park,  NC 
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Fig.  1.  Sample  Bayer  Pattern 

of  the  luminance  is  due  to  the  fact  that  the  luminance  response  curve  of  the  eye  peaks  at 
around  the  frequency  of  green  light  (see  fig.  2). 


Fig.  2.  Luminous  Efficiency  function  of  human  observer.  Note:  peak  is  at  around  frequency  of  green  light 

Since,  at  each  pixel,  only  one  spectral  measurement  was  made,  the  other  colors  must  be 
estimated  using  information  from  aU  the  color  planes  in  order  to  obtain  a  high  resolution  color 
image.  This  process  is  often  referred  to  as  demosaicking.  Interpolation  must  be  performed 
on  the  mosaicked  image  data.  There  are  a  variety  of  methods  available,  the  simplest  being 
Unear  interpolation,  which,  as  shall  be  shown,  does  not  maintain  edge  information  well.  More 
complicated  methods  [2],  [3],  [4],  [5],  [6]  perform  this  interpolation  and  attempt  to  maintain 
edge  detail  or  limit  hue  transitions.  In  [7],  IVussell  introduces  a  linear  lexicographic  model 
for  the  image  formation  and  demosaicking  process,  which  may  be  used  in  a  reconstruction 
step.  In  [8],  linear  response  models  proposed  by  Vora  et.al  [9]  have  been  used  to  reconstruct 
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these  mosaicked  images  using  an  optimization  technique  called  Mean  Field  Annealing  [10]. 
In  this  paper  we  briefly  describe  the  more  commonly  used  demosaicking  algorithms  and 
demonstrate  their  strengths  and  weaknesses.  In  Section  II,  we  describe  the  interpolation 
methods  we  use  in  our  comparisons.  We  compare  the  interpolation  methods  by  rimning  the 
algorithms  on  three  types  of  images  (two  types  of  synthetic  image  sets  and  one  set  of  real- 
world  mosaicked  images).  The  images  used  for  comparison  and  their  properties  are  presented 
in  section  III.  Qualitative  and  quantitative  results  are  presented  in  Section  FV.  Discussions 
about  the  properties  of  these  algorithms  and  their  overall  behavior  are  presented  in  Section 
V.  We  use  two  error  metrics,  the  mean  squared  error  in  the  RGB  color  space  and  the  AE*,, 
error  in  the  CIE-L*a*b*  color  space  (described  in  the  Appendix). 

II.  Demosaicking  Strategies 

A.  Ideal  Interpolation 

Sampling  of  a  continuous  image  f{x,y)  yields  infinite  repetitions  of  its  continuous  spec¬ 
trum  F{C,'n)  in  the  Fourier  domain.  If  these  repetitions  do  not  overlap  (which  is  almost 
never  the  case  as  natural  images  are  not  b^d-limited),  the  original  image  f{x,y)  can  be  re¬ 
constructed  exactly  from  its  discrete  samples  f{m,n),  otherwise  we  observe  the  phenomenon 
of  aliasing.  The  1-D  “ideal”  interpolation  is  the  multiplication  with  a  rect  function  in  the 
frequency  domain  and  can  be  realized  in  the  spatial  domain  by  a  convolution  with  the  sine 
function.  This  “ideal  interpolator”  kernel  is  band-limited  and  hence  is  not  space  limited.  It 
is  primarily  of  theoretical  interest  and  not  implemented  in  practice  [11]. 

B.  Neighborhood  considerations 

It  may  be  expected  that  we  get  better  estimates  for  the  missing  sample  values  by  in¬ 
creasing  the  neighborhood  of  the  pixel,  but  this  increase  is  computationally  expensive.  There 
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is  hence  a  need  to  keep  the  interpolation  filter  kernel  space-Mmited  to  a  small  size  and  also 
extract  as  much  information  from  the  neighborhood  as  possible.  To  this  end,  correlation 
between  color  channels  is  used  [12].  For  RGB  images,  cross-correlation  between  channels 
has  been  determined  and  found  to  vary  between  0.25  and  0.99  with  averages  of  0.86  for 
red/green,  0.79  for  red/blue  and  0.92  for  green/blue  cross  correlations  [13].  One  well-known 
image  model  [12]  is  to  simply  assume  that  red  and  blue  are  perfectly  correlated  with  the 
green  over  a  small  neighborhood  and  thus  differ  from  green  by  only  an  offset.  This  image 
model  is  given  by 

Gij  =  Rij  +  k  (1) 

where  (i,  j)  refers  to  the  pixel  location,  R  (known)  and  G  (unknown)  the  red  and  green  pixel 
values,  k  is  the  appropriate  bias  for  the  given  pixel  neighborhood.  The  same  apphes  at  a 
blue  pixel  location.  Let  us  illustrate  eqn.l  with  an  example  by  considering  the  green  channel 
of  an  image  and  the  corresponding  Green  minus  Red  and  Green  minus  Blue  channels.  In 
fig.3,  we  can  see  that  majority  of  the  regions  in  the  Green  minus  Red  and  Green  minus  Blue 
images  are  locally  uniform  (the  constant  k  in  eqn.l),  especially  in  regions  where  there  is  high 
spatial  detail  (near  the  eyes  of  the  macaws,  say). 

Hence  the  choice  of  the  neighborhood  size  is  critical.  It  is  observed  that  most  imple¬ 
mentations  are  designed  with  hardware  implementation  in  mind  (paying  great  attention  to 
the  need  for  pipelining,  system  latency  and  throughput  per  clock  cycle).  The  larger  the 
neighborhood,  the  greater  the  difficulty  in  pipehning,  the  greater  the  latency,  and  possibly, 
lesser  the  throughput. 

C.  Bilinear  Interpolation 

Consider  the  array  of  pixels  as  shown  in  fig.l.  At  a  blue  center  (where  blue  color  was 
measured),  we  need  to  estimate  the  green  and  red  components.  Consider  pixel  location  44 
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(c)  (d) 

Fig.  3.  (a)  RGB  image  b)  Green  Channel  c)  Green  minus  Red  (d)Green  minus  Blue 

at  which  only  B44  is  measured;  we  need  to  estimate  G44.  Given  G34,  G43,  G45,  G54,  one 
estimate  for  G44  is  given  by  G44  =  (G34  +  G43  +  G45  +  G54)/ 4.  To  determine  Rw,  given  i?33, 
Rih:  R53,  R55,  the  estimate  for  R44  is  given  by  R44  =  {R33  +  R35  +  R53  +  R^s)/^-  At  a  red 
center,  we  would  estimate  the  blue  and  green  accordingly.  Performing  this  process  at  each 
photo-site  (location  on  the  CCD),  we  can  obtain  three  color  planes  for  the  scene  which  would 
give  us  one  possible  demosaicked  form  of  the  scene. 

The  band-limiting  nature  of  this  interpolation  smooths  edges,  which  shows  up  in  color 
images  as  fringes  (referred  to  as  the  zipper  effect  [12],  [14]).  This  has  been  illustrated  with 
two  colors  channels  (for  simplicity)  in  fig.4. 
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(a)  (b)  (c) 

Fig.  4.  Illustration  of  fringe  or  zipper  effect  resulting  from  the  linear  interpolation  process.  An  edge  is 
illustrated  as  going  from  navy  blue  (0,0,128)  to  yellow  (255,255,128).  The  zipper  effect  produces  green  pixels 
near  the  edge  (a)  Original  image  (only  2  colors,  blue  constant  at  128)  (b)  one  scan  line  of  subsampled  Bayer 
pattern  (choose  every  other  pixel)  (c)  result  of  estimating  missing  data  using  linear  interpolation.  Observe 
color  fringe  in  locations  5  and  6 


D.  Constant  hue-based  Interpolation 

In  general,  hue  is  defined  as  the  property  of  colors  by  which  they  can  be  perceived  as 
ranging  from  red  through  yellow,  green,  and  blue,  as  determined  by  the  dominant  wavelength 
of  the  light.  Constant  hue-based  Interpolation,  proposed  by  Cok  [2]  and  is  one  of  the  first 
few  methods  used  in  commercial  camera  systems.  Modifications  of  this  system  are  still  in 
use.  The  key  objection  to  pixel  artifacts  in  images  that  result  from  bilinear  interpolation 
is  abrupt  and  unnatural  hue  change  [2].  There  is  a  need  to  maintain  the  hue  of  the  color 
such  that  there  are  no  sudden  jumps  in  hue  (except  for  over  edges,  say).  The  red  and  blue 
channels  are  assigned  to  be  the  chrominance  channels  while  the  green  channel  is  assigned  as 
the  luminance  channel. 

As  used  in  this  section,  hue  is  defined  by  a  vector  of  ratios  as  {R/ G,  B/ G)  [2] .  It  is  to 
be  noted  that  the  term  hue  defined  above  is  valid  for  this  method  only,  also,  the  hue  needs 
to  be  “redefined”  if  the  denominator  G  is  zero.  By  interpolating  the  hue  value  and  deriving 
the  interpolated  chrominance  values  (blue  and  red)  from  the  interpolated  hue  values,  hueji 
are  allowed  to  change  only  gradually,  thereby  reducing  the  appearance  of  color  fringes  which 
would  have  been  obtained  by  interpolating  only  the  chrominance  values. 
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Consider  an  image  with  constant  hue.  In  exposure  space  (be  it  logarithmic^  or  hnear), 
the  values  of  the  luminance  (G)  and  one  chrominance  component  {R,  say)  at  a  location  {i,j) 
and  a  neighboring  sample  location  (k,l)  are  related  as  Rij/ Rki  =  Gy/ Gki  if  Bij/ Bki  =  Gij/Gki- 
If  Rki  represents  the  unknown  chrominance  value,  and  Rij  and  Gij  represent  measured 
values  and  Gki  represents  the  interpolated  luminance  value,  the  missing  chrominance  value 
Rki  is  given  by  Rki  =  Gki{Rij/Gij)  In  an  image  that  does  not  have  uniform  hue,  as  in  a 
typical  color  image,  smoothly  changing  hues  are  assured  by  interpolating  the  hue  values 
between  neighboring  chrominance  values. 

The  green  channel  is  first  interpolated  using  bilinear  interpolation.  After  this  first  pass, 
the  hue  is  interpolated.  Referring  to  fig.l. 


R44  —  G44 


Rz3  ,  R35  ,  R53  ,  R55 
G^s  G35  G53  G55 


and  similarly  for  the  blue  chaimel 


^33  —  G3 


B22  .  B24  .  B42  B44 
G22  G24  G42  G44 


The  G  values  in  bold-face  are  estimated  values,  after  the  first  pass  of  interpolation.  The  ex¬ 
tension  to  the  logarithmic  exposure  space  is  straightforward  as  multiplications  and  divisions 
in  the  hnear  space  become  additions  and  subtractions,  respectively  in  the  logarithmic  space. 
There  is  a  caveat  however  as  interpolations  will  be  performed  in  the  logarithmic  space  and 
hence  the  relations  in  linear  space  and  exposure  space  are  not  identical  [2].  Hence  in  most 
implementations  the  data  is  first  hnearized  [15]  and  then  interpolated  as  described  above. 

^  Most  cameras  capture  data  in  a  logarithmic  exposure  space  and  need  to  be  linearized  before  the  ratios  used  as 
such.  If  interpolating  in  the  logarithmic  exposure  space,  difference  of  logarithms  needs  to  be  taken  instead  of  ratios; 
i.e.  log(Rij/Rki)  =  log(Rij)  -log(Rki) 
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E,  Median-based  Interpolation 

This  method,  proposed  by  Freeman  [3],  is  a  two  pass  process,  the  first  being  a  hnear 
interpolation,  and  the  second  pass  a  median  filter  of  the  color  differences. 

In  the  first  pass,  linear  interpolation  is  used  to  populate  each  photo-site  with  all  three 
colors  and  in  the  second  pass,  the  difference  image,  of  say.  Red  minus  Green  and  Blue  minus 
Green  is  median  filtered.  The  median  filtered  image  thus  obtained  is  then  used  in  conjunction 
with  the  original  Bayer  array  samples  to  recover  the  samples  (illustrated  below).  This  method 
preserves  edges  very  well,  as  illustrated  in  fig.5  where  only  one  row  of  the  Bayer  array  is 
considered  since  this  process  can  be  extrapolated  to  the  case  of  the  rows  containing  blue  and 
green  pixels.  Fig.5(a)  shows  one  scan  line  of  the  original  image  before  Bayer  subsamphng,  the 
horizontal  axis  is  the  location  index  and  the  vertical  axis  represents  intensity  of  red  and  green 
pixels.  We  have  a  step  edge  between  locations  5  and  6.  Fig.5(b)  shows  the  same  scan  line, 
sampled  in  a  Bayer  fashion,  picking  out  every  other  pixel  for  red  and  green.  Fig.5(c)  (step  1 
of  this  algorithm)  shows  the  result  of  estimating  the  missing  data  using  linear  interpolation. 
Notice  the  color  fringes  introduced  between  pixel  locations  5  and  6;  fig.5(d)  (step  2)  shows 
the  absolute  valued  difference  image  between  the  two  channels;  fig.5(e)  (step  3)  shows  the 
result  of  median  filtering  the  difference  image  with  a  kernel  of  size  5.  Using  this  result  and 
the  sampled  data,  fig.5(f)  is  generated  (step  4)  as  an  estimate  of  the  original  image  (by 
adding  the  median  filtered  result  to  the  sampled  data,  e.g.  the  red  value  at  location  6  is 
estimated  by  adding  the  median  filtered  result  at  location  6  to  the  sampled  green  value  at 
location  6).  The  reconstruction  of  the  edge  in  this  example  is  exact,  although  note  that  for 
a  median  filter  of  size  3,  this  will  not  be  the  case. 

This  concept  can  be  carried  over  to  three  color  sensors  wherein  differences  are  calculated 
between  pairs  of  colors  and  the  median  filter  is  apphed  to  these  differences  to  generate  the 
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(d)  (e)  (f) 

Fig.  5.  Illustration  of  FVeeman’s  interpolation  method  for  a  two  channel  system,  as  in  fig.4  an  edge  is 
illustrated  as  going  from  navy  blue  (0,0,128)  to  yellow  (255,255,128)  (a)  Original  image  (only  2  colors,  blue 
constant  at  128)  (b)  one  scan  line  of  subsampled  Bayer  pattern  (choose  every  other  pixel)  (c)  result  of  linear 
interpolation  (d)  Green  minus  Red  (e)  median  filtered  result  of  the  difference  image  (f)  reconstructed  image 


final  image. 

We  shall  consider  neighborhoods  of  a  size  such  that  all  the  algorithms  can  be  compared 
on  the  same  basis.  The  algorithms  described  in  this  document  have  at  most  9  pixels  under 
consideration  for  “estimation” .  In  a  square  neighborhood,  this  would  imply  a  3  x  3  window. 
We  shall  hence  use  a  3  x  3  neighborhood  for  Freeman’s  algorithm. 


F.  Gradient  Based  Interpolation 

This  method  was  proposed  by  Laroche  and  Prescott  [4]  and  is  in  use  in  the  Kodak 
DCS  200  Digital  Camera  System.  It  employs  a  three  step  process,  the  first  one  being  the 
interpolation  of  the  luminance  channel  (green)  and  the  second  and  third  being  interpolation 
of  the  color  differences  (red  minus  green  and  blue  minus  green).  The  interpolated  color 
differences  are  used  to  reconstruct  the  chrominance  channels  (red  and  blue).  This  method 
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takes  advantage  of  the  fact  that  the  human  eye  is  most  sensitive  to  luminance  changes. 
The  interpolation  is  performed  depending  upon  the  position  of  an  edge  in  the  green  chan¬ 
nel.  Referring  to  fig.l,  if  we  need  to  estimate  G44,  let  a  =  ahs((542 +546)/2  — £44)  and 
/?  =  abs((524  + 564)72-544).  We  refer  to  a  and  /?  as  “classifiers”  and  will  use  them  to 
determine  if  a  pixel  belongs  to  a  vertical  or  horizontal  edge,  respectively.  It  is  intriguing 
to  note  that  the  classifiers  used  are  second  derivatives  with  the  sign  inverted  and  halved  in 
magnitude.  We  come  up  with  the  following  estimates  for  the  missing  green  pixel  value. 

(  G43  +  G45 


G44=< 


2 

G34  +  G54 

2 

G43  +  G45  +  G34  +  G54 


if  a<P 
if  q:  >  /3 
if  a  =  /? 


(4) 


Similarly,  for  estimating  G33,  let  a  —  abs((53i  +  Rzh)l‘^  —  Rzz)  and  ^  =  abs((5i3  +  Rh^l‘1  — 
R33) .  These  are  estimates  to  the  horizontal  and  vertical  second  derivatives  in  red,  respec¬ 
tively.  Using  these  gradients  as  classifiers,  we  come  up  with  the  following  estimates  for  the 
missing  green  pixel  value. 

G32  +  G34 


G33  —  s 


G23  +  G43 


G32  +  G34  +  G23  +  G43 


if  a<  ^ 

if  a>  P 

if  a  =  P 


(5) 


Once  the  luminance  is  determined,  the  chrominance  values  are  interpolated  from  the  differ¬ 
ences  between  the  color  (red  and  blue)  and  luminance  (green)  signals.  This  is  given  by 
534  =  (^33-^33) +  (535 -035)^^^ 

J-,  (533  —  G33)  +  (535  —  G35)  ^  ^ 

543  =  - -  I-  tj43 

^  _  (533  ~  G33)  +  {R35  —  G35)  +  {R53  —  G53)  +  {R55  —  G55)  ^ 


(6) 
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Note  that  the  green  channel  has  been  completely  estimated  before  this  step.  The  bold-face 
entries  correspond  to  estimated  values.  We  get  corresponding  formulae  for  the  blue  pixel 
locations.  Interpolating  color  differences  and  adding  the  green  component  has  the  advantage 
of  maintaining  color  information  and  also  using  intensity  information  at  pixel  locations.  At 
this  point,  three  complete  RGB  planes  are  available  for  the  full  resolution  color  image. 

G.  Adaptive  Color  Plan  Interpolation 

This  method  is  proposed  by  Hamilton  and  Adams  [5].  It  is  a  modification  of  the  method 
proposed  by  Laroche  and  Prescott  [4|.  This  method  also  employs  a  multiple  step  process, 
with  classifiers  similar  to  those  used  in  Laroche-Prescott’s  scheme  but  modified  to  accom¬ 
modate  first  order  and  second  order  derivatives.  The  estimates  are  composed  of  arithmetic 
averages  for  the  chromaticity  (red  and  blue)  data  and  appropriately  scaled  second  derivative 
terms  for  the  luminance  (green)  data.  Depending  upon  the  preferred  orientation  of  the  edge, 
the  predictor  is  chosen.  This  process  also  has  three  nms.  The  first  run  populates  that  lu¬ 
minance  (green)  channel  and  the  second  and  third  runs  populate  the  chrominance  (red  and 
blue)  channels. 

Consider  the  Bayer  array  neighborhood  shown  in  fig.6(a).  Gi  is  a  green  pixel  and  Ai 
is  either  a  red  pixel  or  a  blue  pixel  (All  Ai  pixels  will  be  the  same  color  for  the  entire 
neighborhood).  We  now  form  classifiers  a  =  abs(— A.3  -p  2A5  —  A7)-l-abs(G4  — Ge)  and  P  = 
abs(— Al  +  2 As  —  Ag)  -I-  abs(G2  —  Gs).  These  classifiers  are  composed  of  second  derivative 
terms  for  chromaticity  data  and  gradients  for  the  luminance  data.  As  such,  these  classifiers 
sense  the  high  spatial  frequency  information  in  the  pixel  neighborhood  in  the  horizontal  and 
vertical  directions. 

Consider,  that  we  need  to  estimate  the  green  value  at  the  center,  i.e.  to  estimate  G5. 
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Fig.  6.  Sample  Bayer  Neighborhood,  ki  =  Chrominance  (blue  /  red),  Gf  =  Luminance,  C5  =  red  /  blue 


Depending  upon  the  preferred  orientation,  the  interpolation  estimates  are  determined  as 


G4  +  Gq  —A^  +  2.A5  — 

2  ^  2 

G2  +  Gg  ~Ai  +  2A5  —  Ag 
_  ^  _ 


if  a<P 


if  a>  p 


I  G2  +  G4  +  Gg  +  Gg  ^  ~.d.r  —  d.3  +  4A5  —  >1.7  —  Ag  ^ ^ 

V  4  8 

These  predictors  are  composed  of  arithmetic  averages  for  the  green  data  and  appropriately 
scaled  second  derivative  terms  for  the  chromaticity  data.  This  comprises  the  first  pass  of 
the  interpolation  algorithm.  The  second  pass  involves  populating  the  chromaticity  channels. 
Consider  the  neighborhood  as  shown  in  fig.6(b).  Gi  is  a  green  pixel  and  Ai  is  either  a  red 
pixel  of  a  blue  pixel  and  G*  is  the  opposite  chromaticity  pixel.  Then  Ag  =  (>li  +^3)72  + 
(— Gi  +2G2  -  Ga) /2,  A4  =  (Al  +  ^7)72  +  (— Gi  +2G4  -  Gr)/2.  These  are  used  when  the 
nearest  neighbors  to  Ai  are  in  the  same  row  and  colunm  respectively. 

To  estimate  Gs,  we  employ  the  same  method  as  we  did  to  estimate  the  luminance 
channel.  We  again,  form  two  classifiers,  a  and  P  which  “estimate”  the  gradient  in  the 
horizontal  and  vertical  directions,  a:  =  abs(— G3  +  2G5  —  G7)  +  abs(A3  —  At)  and  P  = 
abs(— Gi  +  2G5  —  Go)  +  abs(Ai  -  Ag).  a  and  P  “sense”  the  high  frequency  information 
in  the  pixel  neighborhood  in  the  positive  and  negative  diagonal  respectively.  We  now  have 
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estimates 


c,  =  { 


(  Az-]-Ai  — G3  +  2G5  — Gr 
2  ^  2 
Ai+Aq  — Gi -j- 2C?5  —  Gg 


if  a<  p 
if  a>  P 


(8) 


Al  +.A3  +  2i7  +  ^9  —Gi  —  Gs  +  4G5  —  Gr  —  Gg  q.  =  ^ 


These  estimates  are  composed  of  arithmetic  averages  for  the  chromaticity  data  and  appro¬ 
priately  scaled  second  derivative  terms  for  the  green  data.  Depending  upon  the  preferred 
orientation  of  the  edge,  the  predictor  is  chosen.  We  now  have  the  three  color  planes  populated 
for  the  Bayer  Array  data. 


III.  Comparison  of  Interpolation  Methods 

We  generated  test  images,  shown  in  fig.7  and  fig.8  which  are  simulations  of  the  data 
contained  in  the  Bayer  Array  of  the  camera.  In  other  words,  these  are  images  that  consider 
“what-if”  cases  in  the  Bayer  Array.  They  were  chosen  as  test  images  to  emphasize  the 
various  details  that  each  algorithm  works  on. 


A.  Type  I  Test  Images 

Images  of  this  type  are  synthetic  and  have  edge  orientations  along  both  the  cardinal 
directions  as  well  as  in  arbitrary  directions  as  shown  in  fig.7.  Test  Imagei  was  chosen  to 
demonstrate  the  artifacts  each  process  introduces  for  varying  thicknesses  of  stripes  (increas¬ 
ing  spatial  frequencies).  Test  Image2  was  chosen  to  study  a  similar  performance,  but  with  a 
constant  spatial  frequency.  Test  Images  is  a  section  from  the  starburst  pattern,  to  test  the 
robustness  of  these  algorithms  for  non-cardinal  edge  orientations.  Note  that  these  images 
have  perfectly  correlated  color  planes.  The  intent  of  these  images  is  to  highlight  alias-induced 
fringing  errors. 


Fig.  7.  Type  I  Test  Images,  a)  Test  Imagei  has  vertical  bars  with  decreasing  thicknesses(16  pixels  down  to 
1  pixel)  b)  Test  Image2  has  bars  of  constant  width  (3  pixels)  (c)  Test  Images  is  a  section  from  the  starburst 
pattern 

B.  Type  II  Images 

Three  RGB  images,  shown  in  fig.8  were  subsampled  in  the  form  of  a  Bayer  array  and 
then  interpolated  to  get  the  three  color  planes.  The  regions  of  interest  (ROIs)  in  this  image 
has  been  highlighted  with  a  white  box.  These  images  were  chosen  specifically  to  highlight 


(a)  (b)  (c) 

Fig.  8.  Type  II  Images,  (a)  Test  Image4  (b)  Original  RGB  Macaw  Image  showing  ROIs  (c)  Original  Crayon 
Image  showing  ROIs 

the  behavior  of  these  algorithms  when  presented  with  color  edges.  Test  Image4  is  a  synthetic 
image  of  randomly  chosen  color  patches.  Unlike  Type  I  images,  these  images  have  sharp 
discontinuities  in  all  color  planes,  independent  of  each  other.  The  ROIs  in  fig.8(b)  have 
relatively  high  spatial  frequencies.  The  ROIs  in  fig.8(c)  have  distinct  color  edges,  one  between 
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pastel  colors  and  the  other  between  fully  saturated  colors. 

C.  Type  III  Images 

This  category  of  images  consists  of  real-world  camera  images  captured  with  a  camera  that 
has  a  CPA  pattern.  No  internal  interpolation  is  performed  on  them.  We  were  therefore  able 
to  get  the  “true”  CPA  imagery  corrupted  only  by  the  optical  PSP.  The  ROIs  of  these  images 
are  shown  in  figs. 17(a)  and  18(a).  CPAi  has  sharp  edges  and  high  frequency  components 
while  CPA2  has  a  color  edge. 

IV.  Results 

The  results  of  the  demosaicking  algorithms  presented  in  section  II  on  the  three  types  of 
images  are  shown  in  figs.9  to  18.  Literature  [16]  suggests  that  the  AE*^  (definition  included 
in  the  Appendix)  error  metric  represents  human  perception  effectively.  We  hence  make 
use  of  this  to  quantify  the  errors  observed.  However,  bear  in  mind  the  bounds  on  t.Vii.c! 
error  for  detectability  that  errors  less  than  about  2.3  are  not  easily  detected  while 
on  the  other  hand,  errors  greater  than  about  10  are  so  large  that  relative  comparison  is 
insignificant  [17].  This  metric  gives  us  a  measure  of  the  difference  between  colors  as  viewed 
by  a  standard  observer.  Another  metric  used  for  comparison  is  the  mean  squared  error 
(MSE)  which  provides  differences  between  colors  in  a  “Euchdean”  sense.  MSE,  although 
not  being  representative  of  the  errors  we  perceive,  is  popular  because  of  its  tractabihty  and 
ease  in  implementation.  These  metrics  are  tabulated  in  Tables  I  and  II.  The  boldface 
numbers  represent  the  minimum  values  in  the  corresponding  image,  which  gives  us  an  idea 
about  which  algorithm  performs  best  for  a  given  image.  There  will  be  errors  introduced  in 
the  printing/reproduction  process,  but  assuming  that  the  errors  will  be  consistent  for  all  the 
reproductions,  we  may  infer  relative  performance  of  these  algorithms. 
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In  fig.9  and  fig.  10,  notice  the  fringe  artifacts  introduced  in  linear  interpolation,  termed 
as  the  zipper  effect  by  Adams  [12].  The  appearance  of  this  effect  is  considerably  reduced 
(observe  the  decrease  in  the  metrics)  in  Cok’s  interpolation.  Hamilton-Adams’  and  Laroche- 
Prescott’s  implementation  estimates  Test  Image2  exactly  (notice  that  the  MSE  and  AE*f^ 
errors  are  zero).  This  is  because  both  these  algorithms  use  information  from  the  other  chan¬ 
nels  for  estimation  (chrominance  channel  to  interpolate  luminance  and  vice  versa).  Notice 
that  all  these  algorithms  perform  poorly  at  high  spatial  frequencies.  All  the  algorithms 
discussed  here  have  identical  properties  in  the  horizontal  and  vertical  directions. 

For  non-cardinal  edge  orientations  such  as  those  shown  in  Test  hnagea  (fig.  11)  perfor¬ 
mance  (observed  in  the  error  metrics  also)  is  noted  to  be  worse.  Note  that  the  AE^,  error 
metric  is  “on  an  average”  considerably  higher  for  Test  Images  when  compared  to  Test  Imagei 
and  Test  Images. 


Algorithm 

used 

Test 

Imagei 

Test 

Image2 

Test 

Images 

Test 

Image4 

Macaw 

ROIi 

Macaw 

ROI2 

Crayon 

ROIi 

Crayon 

ROI2 

Linear 

34.731 

65.487 

57.553 

9.711 

15.457 

23.299 

7.293 

3.645 

Cok 

16.352 

27.122 

30.828 

11.437 

11.017 

14.924 

6.003 

4.131 

Freemon 

15.179 

55.301 

19.513 

9.599 

5.404 

7.421 

4.649 

3.645 

Laroche- 

Prescott 

7.321 

0 

24.592 

10.944 

11.028 

14.198 

5.507 

4.234 

Hamilton- 

Adams 

3.052 

0 

21.793 

9.303 

9.279 

11.579 

4.409 

3.936 

TABLE  I 


ERRORS  FOR  DIFFERENT  INTERPOLATION  ALGORITHMS  AFTER  DEMOSAICKING 


Test  Image4  has  been  used  to  illustrate  the  performance  of  these  algorithms  when  pre¬ 
sented  with  sharp  edges  which  do  not  have  correlated  color  planes  (see  fig.  12.  Prom  the 
error  metrics,  it  is  clear  that  all  of  them  perform  poorly  at  sharp  color  edges.  Note  however 
that  although  the  AE*f,  errors  are  high,  the  squared  error  metric  is  relatively  low,  clearly 
highlighting  the  advantage  of  using  Using  only  the  squared  error  would  have  been 
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Algorithm 

used 

Test 

Imagei 

Test 

Image2 

Test 

Images 

Test 

Image4 

Macaw 

ROIi 

Macaw 

ROI2 

Crayon 

ROIi 

Crayon 

ROI2 

Linear 

154 

253 

101.6 

18.1 

33.0 

68.6 

10.4 

1.7 

Cok 

100 

163 

67.3 

31.0 

20.5 

37.5 

6.7 

2.1 

lYeeman 

52.2 

134 

5.7 

19.9 

3.9 

3.4 

2.8 

1.6 

Laroche- 

Prescott 

35.3 

0 

8.8 

26.2 

20.1 

31.5 

5.8 

1.9 

Hamilton- 

Adams 

21.4 

0 

8.3 

26.6 

11.7 

10.5 

3.3 

1.9 

TABLE  II 


MSE  (X  10“^)  FOR  DIFFERENT  INTERPOLATION  ALGORITHMS  AFTER  DEMOSAICKING 


(a)  (b)  (c)  (d)  (e) 

Fig.  9.  (a) Linear  (b)Cok  (c)fVeeman  (d)Laroche-Prescott  (e) Hamilton- Adams  interpolations  on  Test  Imagei . 
Note:  Images  are  not  the  same  size  as  original.  Image  has  been  cropped  to  hide  edge  effects 


5  10  15  20  25  5  10  15  20  25  5  10  15  20  25  5  10  15  20  25  5  10  15  20  25 


(a)  (b)  (c)  (d)  (e) 

Fig.  10.  (a) Linear  (b)Cok  (c) Freeman  (d)Laroche-Prescott  (e) Hamilton- Adams  interpolations  on  Test 

image2.  Note:  Images  are  not  the  same  size  as  original.  Image  has  been  cropped  to  hide  edge  effects 
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Fig.  11.  (a)Linear  (b)Cok  (c)EVeeman  (d)Laroche-Prescott  (e) Hamilton- Adams  interpolations  on  Test 
Images.  Note:  Images  are  not  the  same  size  as  original.  Image  has  been  cropped  to  hide  edge  effects 


(a)  (b)  (c)  (d)  (e) 

Fig.  12.  (a)Linear  (b)Cok  (c)FVeeraan  (d)Laroche-Prescott  (e)Hamilton-Adams  interpolations  on  Test 
lmage4.  Note:  Images  are  not  the  same  size  as  original.  Image  has  been  cropped  to  hide  edge  effects 


misleading. 

The  macaw  images  illustrate  the  alias-induced  errors  while  at  the  same  time,  showing 
a  confetti  type  of  error.  These  errors  come  about  due  to  intensely  bright  or  dark  points  (in 
a  dark  or  bright  neighborhood,  respectively).  Freeman’s  algorithm  performs  best  in  these 
regions  because  it  is  able  to  remove  such  “speckle”  behavior  in  the  images  due  to  the  median 
filtering  process  (observe  that  the  errors  are  smallest  for  Freeman’s  algorithms  in  such 
regions). 

The  crayon  images  on  the  other  hand  are  reproduced  very  precisely  (see  fig.  15  and 
fig.16),  with  very  few  errors.  ROIi  shows  some  errors  at  the  edges  where  the  line-art  appears. 
However,  this  error  is  not  very  evident.  ROI2  is  reproduced  almost  exactly.  In  fact,  depending 
upon  the  print  process  or  the  display  rendering  process,  one  may  not  be  able  to  see  the  errors 
generated  at  all.  This  shows  that  these  algorithms  perform  very  well  at  blurred  color  edges 
(which  is  the  case  with  many  natural  scenes). 
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(a)  (b)  (c)  (d)  (e)  (f) 

Fig.  13.  (a)  Original  “truth”  ROIi  of  macaw  image  (b)Lmear  (c)Cok  (d)Preeman  (e)Laroche-Prescott 

(f)Hamilton-Adams  interpolations  on  Macaw  Image.  Note:  Images  are  displayed  along  with  original  image 
for  comparison  purposes 


(a)  (b)  (c)  (d)  (e)  (f) 

Fig.  14.  (a)  Original  “truth”  ROI2  of  macaw  image  (b)Linear  (c)Cok  (d)fVeeman  (e)Laroche-Prescott 

(f)Hamilton-Adams  interpolations  on  Macaw  Image.  Note:  Images  are  displayed  along  with  original  image 
for  comparison  purposes 


In  Type  III  images  which  are  raw  readouts  from  a  CFA  camera,  we  cannot  use  the 
metrics  we  have  been  using  thus  far  as  there  is  no  “reference”  image  with  which  to  compare 
these  results.  However  we  may  use  visual  cues  to  determine  performance,  and  we  observe 
similar  trends  in  these  images  as  was  observed  in  synthetic  images.  Observe  in  fig.17  that 
the  high  spatial  frequencies  and  non-cardinal  edge  orientations  are  not  reproduced  correctly 
(as  was  the  case  with  Type  I  images).  Color  edges  are  also  reproduced  with  reasonably  good 
fidelity  as  is  seen  in  fig.  18  -  although  some  zipper  effect  is  observed  with  Linear  and  Cok 
interpolations. 


Fig.  15.  (a)  Original  “truth”  ROIi  of  crayon  image  (b)Linear  (c)Cok  (d)Freeman  (e)Laroche-Prescott 

(f) Hamilton- Adams  interpolations  on  Macaw  Image.  Note:  Images  are  displayed  along  with  original  image 
for  comparison  purposes 


Fig.  16.  (a)  Original  “truth”  ROI2  of  crayon  image  (b) Linear  (c)Cok  (d)Freeman  (e)Laroche-Prescott 

(f)Hamilton- Adams  interpolations  on  Macaw  Image.  Note:  Images  are  displayed  along  with  original  image 
for  comparison  purposes 
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(d)  (e)  (f) 

Fig.  17.  (a)  Original  image  CFAi  (b)Linear  (c)Cok  (d)fVeeman  (e)Laroche-Prescott  (f) Hamilton- Adams 

interpolations 


(d)  (e)  (f) 

Fig.  18.  (a)  Original  image  CFA2  (b)  Linear  (c)Cok  (d)  Freeman  (e)Laroche-Prescott  (f)  Hamilton- Adams 

interpolations 
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V.  Discussion 

Laroche-Prescott’s  and  Hamilton-Adams’  interpolation  processes  have  similar  forms. 
Both  of  them  use  second  derivatives  to  perform  interpolation  which  may  be  written  as 

v  =  u  +  Xg  (9) 

where  u  is  the  data  (original  image),  v  is  the  resulting  image  A  >  0,  5^  is  a  suitably  defined 

gradient.  We  may  think  of  eqn.9  in  the  form  of  that  used  for  unsharp  masking  [18],  an 

enhancement  process.  Unsharp  masking  may  be  interpreted  as  either  subtraction  of  the 
low-pass  image  from  the  original  image  (scaled)  or  of  even  as  addition  of  a  high-pass  image 
to  the  original  image  (scaled).  To  see  the  equivalence  let  the  image  I  be  written  as 

I  =  L  +  H  (10) 

the  sum  of  its  low-pass  (L)  and  high-pass  (H)  components.  Now,  define  unsharp  masking 
by 

F  =  al-L 

=  (o-l)/  +  /-L  (11) 

=  {a-l)I  +  H 

which  has  a  form  similar  to  that  in  eqn.9.  Hence,  one  of  the  many  ways  to  interpret  Laroche- 
Prescott’s  and  Hamilton-Adams’  algorithms,  is  an  unsharp  masking  process.  It  may  hence 
be  expected  that  these  processes  will  sharpen  edges  (only  those  in  the  cardinal  directions, 
due  to  the  manner  in  which  they  are  implemented)  in  the  resulting  images  as  is  observed  in 
the  results  obtained  from  Laroche-Prescott’s  and  Hamilton-Adams’  interpolations  (figs. 9  to 
18). 

Prom  Tables  I  and  II,  on  the  basis  of  simple  majority.  Freeman’s  algorithm  outperforms 
the  other  algorithms.  On  the  other  hand,  in  two  cases,  it  performs  very  poorly. 
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For  Test  Imagei,  as  can  be  seen  from  fig.9,  Linear  interpolation  produces  the  zipper 
effect  that  had  been  mentioned  earlier.  This  is  because  hnear  interpolation  is  a  low  pass 
filter  process  and  hence  incorrectly  locates  the  edges  in  each  color  plane,  introducing  zipper 
[12].  Cok’s  interpolation  reduces  hue  transitions  over  the  edges  since  it  interpolates  the 
hue  of  the  colors  and  not  the  colors  themselves  which  reduces  abrupt  hue  jumps  producing 
fewer  perceptual  artifacts.  Freeman’s  algorithm,  using  the  median  as  an  estimator,  performs 
poorly  because  it  first  performs  a  hnear  interpolation  for  the  green  channel  (a  blur  process), 
also  introducing  ripples.  Laroche-Prescott’s  algorithm,  using  classifiers  to  interpolate  in  the 
preferred  orientation  reduces  errors.  Also,  interpolating  color  differences  (chrominance  minus 
luminance),  it  utilizes  information  from  two  channels  to  precisely  locate  the  edge.  Hamilton- 
Adams’  algorithm  interpolates  the  luminance  channel  with  a  bias  to  the  second  derivative  of 
the  chrominance  channel,  locating  the  edge  in  the  three  color  planes  with  better  accuracy. 

In  Test  Image2,  although  we  find  the  same  trend  in  Linear  and  Cok  interpolations  as  we 
did  in  Test  Imagei,  we  find  that  Laroche-Prescott’s  and  Hamilton- Adams’  algorithms  are 
able  to  reproduce  the  image  exactly.  This  is  attributed  to  the  structure  (and  size)  of  their 
estimators  and  the  width  of  the  bars  themselves  (3  pixels). 

In  the  Test  Images,  there  are  two  factors  that  the  algorithms  are  tested  against,  one  is 
varying  spatial  frequencies  and  the  other  being  non-cardinal  edge  orientations.  Comparing 
figs. 9  and  10  with  fig.ll,  we  observe  that  vertical  and  horizontal  directions  are  reproduced 
with  good  clarity  while  edges  along  other  orientations  are  not,  alluding  to  the  fact  that 
almost  all  these  algorithms  (with  the  exception  of  Hamilton- Adams’,  which  incorporates 
some  diagonal  edge  information)  are  optimized  for  horizontal  and  vertical  edge  orientations. 
A  similar  observation  is  made  for  the  CFA  images. 

Note  that  in  Test  Image4,  the  edge  between  the  two  green  patches  has  been  estimated 
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with  good  accuracy  by  Laroche-Prescott’s  and  Hamilton-Adams’  algorithms.  This  is  at¬ 
tributed  to  the  fact  that  these  two  algorithms,  unhke  the  others,  use  data  from  all  the  color 
planes  for  estimation.  In  this  case,  the  data  on  either  side  of  the  edge  being  “similar”,  the 
estimate  was  correct. 

Another  trend  observed  is  that  Hamilton-Adams’  algorithm  performs  better  than  Laroche- 
Prescott’s  algorithm.  This  is  attributed  to  two  reasons;  one  that  the  process  of  estimating 
the  green  channels  in  Hamilton-Adams’  algorithm  incorporates  the  second  order  graxiient  in 
the  chrominance  channels  also,  providing  a  better  estimate  while  Laroche-Prescott’s  algo¬ 
rithm  simply  performs  a  prefential  averaging.  The  second  reason  is  that  Hamilton-Adams’ 
algorithm  estimates  diagonal  edges  while  estimating  the  chrominance  chaimels,  giving  it 
more  sensitivity  to  non-cardinal  chrominance  gradients  (which  partially  explains  the  slightly 
smaller  error  metrics  for  Test  Images). 

VI.  Conclusion 

It  has  been  demonstrated  that  although  the  CFA  pattern  is  very  useful  to  capture  multi- 
spectral  data  on  a  monolithic  array,  this  system  comes  with  associated  problems  of  “missing 
samples” .  The  estimation  of  these  missing  samples  needs  to  be  done  in  an  efficient  manner, 
at  the  same  time,  reproducing  the  original  images  with  high  fidelity. 

In  general,  we  observe  two  types  of  errors  zipper  effect  errors  (occur  at  intensity  edges 
see  fig.9  for  this  behavior)  confetti  errors  (occur  at  bright  pixels  surrounded  by  a  darker 
neighborhood  see  figs. 14  and  13  for  this  behavior).  Experimentally,  it  has  been  found  that 
Freeman’s  algorithm  is  best  suited  for  cases  in  which  there  is  speckle  behavior  in  the  image, 
while  Laroche-Prescott’s  and  Hamilton-Adams’  algorithms  are  best  suited  for  images  with 
sharp  edges. 

It  is  to  be  noted  that  demosaicking  is  not  shift-invariant.  Different  results  are  observed 
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if  the  location  of  the  edges  is  phase-shifted  (the  zipper  effect  errors  show  up  either  as  blue- 
cyan  errors  or  as  orange-yellow  errors  depending  upon  edge-location,  see  fig.9).  The  result 
of  demosaicking  is  hence  a  function  of  the  edge  location. 


Appendix 


Two  of  the  color  models  suggested  by  the  CIE  which  are  perceptually  balanced  and 
uniform  are  the  CIE-L*u*v*  and  the  GIE-L*a*b*  color  models.  The  CIE-L*u*v*  model  is 
based  on  the  work  by  MacAdams  on  the  Just  Noticeable  Differences  in  color  [16].  These 
color  models  are  non-linear  transformations  of  the  XYZ  color  model.  The  transformation 
from  the  XYZ  space  to  the  CIE-L*a*b*  space  is  given  by 


for  ^  >  0.008856 
otherwise 


where  X„,  Y»,  Z„  are  the  values  of  X,  Y,  Z,  for  the  appropriately  chosen  reference 
white;  and  where,  if  any  of  the  ratios  (X/Xn),  {y(Yn)  or  (Z/Zn)  is  less  than  or  equal  to 
0.008856,  it  is  replaced  in  the  above  formula  by  7.787F-I- 16/116  where  F  is  (X/A„),  (Y/Yn) 
or  (Z/Zn)  as  the  case  may  be.  The  color  differences  in  the  CIE-L*a*b*  color  space  are  given 
by  AE*t,  =  ^{AL*)^  +  {Aa*y  +  {Ab*y. 
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Abstract 

RAMANATH,  RAJEEV.  Interpolation  Methods  for  the  Bayer  Color  Array  (under  the 
guidance  of  Dr.  Wesley  E.  Snyder)  Digital  still  color  cameras  working  on  single  CCD- 
based  systems  have  a  mosaicked  mask  of  color  filters  on  the  sensors.  The  Bayer  array 
configuration  for  the  filters  is  popularly  used.  This  requires  that  the  data  be  interpolated 
to  recover  all  the  scene  information.  Many  existing  interpolation  (demosaicking) 
algorithms  that  can  reconstruct  the  scene  use  modifications  of  the  bilinear  interpolation 
method,  intro-ducing  a  variety  of  artifacts  in  the  images.  These  algorithms  have  been 
investi-gated.  A  new  method  for  restoring  these  color  images  using  an  optimization 
method  known  as  Mean  Field  Annealing  is  introduced  using  a  variety  of  image  prior 
models.  Their  performance  relative  to  existing  demosaicking  methods  is  included. 


